A conceptually simple, block-coding feedback strategy which applies to
all time-discrete, memoryless channels is introduced and examined. This
strategy provides the first conclusive evidence of an improvement at nonzero
rates in the reliability of block coding with feedback on the additive Gaussian
noise channel. Such an improvement was previously observed by Berlekamp only
for the binary symmetric channel.

I. INTRODUCTION 

Coding with feedback has been considered by a number of authors, 1,2,3,4,5,6
principally because of the advantage it is expected to enjoy in
rate, reliability, and equipment costs over coding without feedback.
Some early feedback results were obtained by Shannon, who showed
that the channel capacity of a discrete memoryless channel (DMC) cannot
be increased using feedback 1 and established that the sphere-packing
(lower) bound 7,8 to the probability of error without feedback applies
to block coding with feedback on the DMC uniform at the input (the
sets of transition probabilities from each channel input letter are
identical except for permutations 7). He also conjectured that a
sphere-packing bound applies to block coding with feedback on all discrete,
memoryless channels 2 (a conjecture supported recently by Berlekamp 9).

It has been shown that the exponent on the sphere-packing bound
agrees with the no-feedback, random-code exponent at rates above the
critical rate $R_{\mathrm{crit}}$, 7,10 so that feedback cannot increase the reliability of
block coding at rates greater than $R_{\mathrm{crit}}$. Below $R_{\mathrm{crit}}$, Berlekamp first
showed that feedback will improve the reliability of block coding. 3 He
showed that the zero-rate exponent of the probability of error with block
codes on the binary symmetric channel (BSC) and certain other binary
channels having particular symmetries is larger with feedback than the
best no-feedback exponent at zero rate. Also, he showed that the
error-correction capability of block codes on the BSC at rates between zero
and $R_{\mathrm{crit}}$ is improved with feedback.

Variable-block-length coding strategies, that is, strategies where the
code length is controlled through the feedback channel, have been
proposed and, in contrast with the block-coding strategies, have been
shown to operate with a greater reliability than that given by the
sphere-packing bound. 4,5 Hence, the best variable-length feedback strategy
is superior to the best block-coding feedback strategy. The block-coding 
strategies, however, are interesting since any improvement in coding 
reliability observed with them can be ascribed directly to the effect of 
feedback on the choice of codewords representing messages. This is not 
true of the variable-length strategies since, in this case, some unknown 
fraction of the improvement in reliability is attributable to the variation 
of the code length with the level of channel noise. 

In this paper, we shall be concerned only with block coding with
feedback and, in particular, with one block-coding strategy.
This is a strategy which makes efficient but not complete use of the
feedback channel. (This will be seen from the results.) It is recommended,
however, by its simplicity; by its easy evaluation in terms of known
(no-feedback) bounds; by its application to a large class of channels, including
all time-discrete, memoryless channels; by the substantial improvements
it shows for many channels over the no-feedback, random-code
exponents; and by the improvement it shows for some channels over the
best upper bounds on no-feedback exponents. In particular, we see an
improvement over the random-code exponents over a middle range of
rates on the BSC with crossover probability less than $10^{-7}$ and on the
average-power-limited, additive Gaussian noise channel (AGC) with
signal-to-noise ratio (S/N) greater than 11.5 dB. This is interesting,
since some believe that the random-code exponents are the best
no-feedback, block-coding exponents. Also, we see an improvement, again
over a middle range of rates, over the Wyner upper bound 11 to the
no-feedback exponent on the AGC with S/N $\ge$ 22 dB.

It is clear that our strategy does not make full use of the feedback 
channel since the improvements noted apply only to cleaner channels 
and then only over a middle range of rates. Also, it can be shown that 
our strategy has an exponent which is smaller than the exponent implied 
by Berlekamp's results for the BSC. 3 It should be noted, however, that 
the BSC is the only channel for which it has been previously shown that 
feedback can improve the reliability of block coding.

II. THE CODING STRATEGY

The feedback, block-coding strategy which we present was suggested 
by arguments used by Shannon and Gallager to underbound the proba- 
bility of error with block coding. 12 Our strategy, however, leads to an 
overbound to the probability of error with the best feedback strategy 
since it does not make the most efficient use of the return channel. We 
will describe this strategy without referring directly to the input and 
output alphabets of the forward channel. We assume only that the 
forward channel is time-discrete. With respect to the return channel, 
we assume that it is noiseless and of large but finite capacity. The forward 
and reverse channels are allowed to have a total delay of D channel 
symbols. 

Our feedback, block-coding strategy is a 2-step procedure. For each
of the two steps a block code is chosen and used without feedback. In
the first step, a codeword from a code of $M$ codewords, each having length
$N_1$, is chosen to transmit one of the $M$ messages generated by the
source. In the second step, a code of $L$ codewords,* where each word has
length $N_2$, is used. Here $N_1$ and $N_2$ are chosen so that $N_1 + N_2 + D = N$,
the number of channel symbol intervals allowed for the transmission
of one of the $M$ source messages.

The decoder receives a noisy version of the codeword chosen from the
first code and then makes a list of the $L$ messages which are most
likely given the received signal.† We assume that all messages are
equally likely, a priori, so that the $L$ messages on the list are those
which have the largest likelihood. That is, if $p(v_1 \mid m)$ is the
probability (or probability density, if the channel output alphabet is
continuous) of receiving the $N_1$ channel letters represented by $v_1$
when message $m$ is transmitted, then messages $m_1, m_2, \ldots, m_L$ are
on the list if

$$p(v_1 \mid m_i) \ge p(v_1 \mid m') \quad \text{all } m' \ne m_i,\ 1 \le i \le L. \tag{1}$$

Once the list has been formed, the list and the order in which messages
appear on the list are sent over the reverse channel to the transmitter.
Since $D$ time intervals will elapse before the list reaches the transmitter,
the transmitter is ready to begin the second of the two steps after
$N_1 + D$ intervals. In the second step, the transmitter chooses a
codeword from the second code of length $N_2$ to indicate which message on
the list of the $L$ messages is the source output. If the source output is
not on the list, the first codeword is transmitted. The received sequence
is decoded using the max-likelihood decoding rule.

* $L$ is arbitrary here but is fixed in later discussion.
† This decoding procedure is called "list decoding." 8,13
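The two-step procedure described above can be illustrated with a small simulation. The following is a minimal sketch, assuming a BSC forward channel with crossover probability below 1/2 (so that ranking messages by likelihood is the same as ranking them by Hamming distance), a noiseless return channel, and small codebooks supplied by the caller. The function names (`bsc`, `hamming`, `two_step_transmit`) are hypothetical and do not come from the text, and the delay $D$ is ignored.

```python
import random

def bsc(word, p, rng):
    """Pass a binary word through a BSC with crossover probability p."""
    return [b ^ (rng.random() < p) for b in word]

def hamming(a, b):
    """Number of positions in which two equal-length words differ."""
    return sum(x != y for x, y in zip(a, b))

def two_step_transmit(message, code1, code2, p, rng):
    """Simulate one use of the two-step strategy; return the decoded message.

    code1 holds M codewords of length N1, code2 holds L codewords of length N2.
    """
    L = len(code2)
    # Step 1: send the codeword for the message, then list-decode.
    v1 = bsc(code1[message], p, rng)
    ranked = sorted(range(len(code1)), key=lambda m: hamming(code1[m], v1))
    message_list = ranked[:L]  # the L most likely messages, in order
    # Feedback: the ordered list reaches the transmitter without error.
    # Step 2: signal the position on the list; if the message is not on
    # the list, the first codeword of the second code is sent.
    index = message_list.index(message) if message in message_list else 0
    v2 = bsc(code2[index], p, rng)
    decoded = min(range(L), key=lambda i: hamming(code2[i], v2))
    return message_list[decoded]

if __name__ == "__main__":
    rng = random.Random(1)
    code1 = [[rng.randint(0, 1) for _ in range(15)] for _ in range(32)]
    code2 = [[rng.randint(0, 1) for _ in range(10)] for _ in range(4)]
    trials = 2000
    errors = sum(two_step_transmit(m % 32, code1, code2, 0.05, rng) != m % 32
                 for m in range(trials))
    print("estimated error rate:", errors / trials)
```

The random codebooks in the usage example are chosen only for illustration; they are not the best list code or the best max-likelihood code to which the bounds in the text refer.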

III. EVALUATION OF THE STRATEGY 

Our coding strategy, which first establishes a list of most probable
transmitted messages and then resolves the ambiguity in the list, can
lead to a decoding error in either of two ways. First, the message
delivered by the source may not be on the list because of excessive channel
noise in the first $N_1$ transmissions or, second, it may be on the list but
the list may be decoded in error during the last $N_2$ transmissions.

Now, the probability of a decoding error with the best feedback
strategy, $P_F(N,M,1)$, is less than or equal to $P_f(N,M,1)$, the probability
of error with the feedback strategy given above. This, in turn, is less
than or equal to the sum of the probability of list decoding error with
the best list code (and no feedback), $P_e(N_1, M, L)$, and the probability
of a max-likelihood decoding error with the best max-likelihood code,
$P_e(N_2, L, 1)$. Thus, we have

$$P_F(N,M,1) \le P_f(N,M,1) \le P_e(N_1, M, L) + P_e(N_2, L, 1) \tag{2}$$

where

$$N = N_1 + N_2 + D \tag{3}$$

and $D$ is the round-trip delay.

While our feedback strategy can be evaluated for any forward channel
for which bounds to $P_e(N_1, M, L)$ and $P_e(N_2, L, 1)$ are known, we
restrict our attention here to the discrete memoryless channel and to
the time-discrete, average-power-limited, additive Gaussian noise
channel. Bounds to these error probabilities for these two channels can be
found in several places. 7,10,14,15 However, for easy reference we shall
refer to Gallager. 10 Gallager does not bound $P_e(N_1, M, L)$ in his paper,
but the changes necessary in his analysis to bound it are relatively
easy to effect and are outlined in the Appendix. These changes are due
to unpublished results by Gallager* and are such that the random-code
bound to $P_e(N_1, M, L)$ has the same form as Gallager's bound 10 to
$P_e(N_1, M, 1)$ except that his parameter $\rho$ is allowed to range between
0 and $L$ rather than between 0 and 1.

* See also Ref. 16.

The bounds to $P_e(N_1, M, L)$ and $P_e(N_2, L, 1)$ are given below, where
$R_1 = (\log_2 M)/N_1$, $R_2 = (\log_2 L)/N_2$, and $o_i(N_i)$, $1 \le i \le 2$, are
quantities which approach zero faster than $1/N_i$:

$$P_e(N_1, M, L) \le 2^{-N_1[E_L(R_1) + o_1(N_1)]} \tag{4}$$

$$P_e(N_2, L, 1) \le 2^{-N_2[E_x(R_2) + o_2(N_2)]} \tag{5}$$



Here $E_L(R_1)$ and $E_x(R_2)$ are the random-code bound and expurgated
random-code bound, respectively, to the exponents on the two error
probabilities. The formal statement of these two exponents for the DMC
and the AGC is somewhat long. Consequently, we present Table I,
which lists the location of the two exponents by equation number in
Ref. 10. We note again that $E_L(R_1)$ has the same form as $E_1(R_1)$, which
is the random-code exponent given by Gallager, except that $0 \le \rho \le L$.

Table I. Location of Exponents in Ref. 10.

                  DMC        AGC
  $E_L(R_1)$      21,22      125,127,128
  $E_x(R_2)$      86,87      133

Equations (4) and (5) are used to bound (2). The block lengths
$N_1$ and $N_2$ of the two codes are chosen before transmission to
approximately optimize the bounds (4) and (5). That is, we set

$$N_1 E_L(R_1) = N_2 E_x(R_2). \tag{6}$$

Since $N_2 = (N - D) - N_1$ we have

$$N_1 = \frac{(N - D) E_x(R_2)}{E_x(R_2) + E_L(R_1)}. \tag{7}$$

Thus, the exponent $N_1 E_L(R_1)$ becomes

$$N_1 E_L(R_1) = \frac{(N - D) E_x(R_2) E_L(R_1)}{E_x(R_2) + E_L(R_1)}. \tag{8}$$

The signaling rate of the code is defined as $R = (\log_2 M)/N$ so that

$$R = \left(1 - \frac{D}{N}\right) \frac{R_1 E_x(R_2)}{E_x(R_2) + E_L(R_1)}. \tag{9}$$
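As a worked illustration of (6) through (9), the sketch below computes the block-length split $N_1 = (N - D)E_x/(E_x + E_L)$, $N_2 = (N - D) - N_1$, and the resulting rate $R = (1 - D/N) R_1 E_x/(E_x + E_L)$ for given exponent values. The numbers used ($N = 1000$, $D = 0$, $E_L = 0.5$, $E_x = 1.5$, $R_1 = 0.4$) are arbitrary assumptions chosen only to make the arithmetic easy to follow, and the function names are hypothetical.

```python
def split_block_lengths(N, D, E_L, E_x):
    """Return (N1, N2) such that N1*E_L equals N2*E_x, as in (6) and (7)."""
    usable = N - D                    # symbols left after the round-trip delay
    N1 = usable * E_x / (E_x + E_L)   # equation (7)
    N2 = usable - N1
    return N1, N2

def overall_rate(N, D, R1, E_L, E_x):
    """Signaling rate R = (log2 M)/N expressed through R1, as in (9)."""
    return (1 - D / N) * R1 * E_x / (E_x + E_L)

N1, N2 = split_block_lengths(N=1000, D=0, E_L=0.5, E_x=1.5)
# With this split both terms on the right of (2) decay at the same rate.
print("N1, N2:", N1, N2)
print("N1*E_L, N2*E_x:", N1 * 0.5, N2 * 1.5)
print("R:", overall_rate(N=1000, D=0, R1=0.4, E_L=0.5, E_x=1.5))
```

The step with the larger exponent receives the smaller share of the block, which is why a large $E_x(R_2)$ leaves most of the block for the first, information-bearing step.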



The exponent to the probability of error with our feedback strategy,
$E_f(R)$, is defined as

$$E_f(R) = \lim_{N \to \infty} -\frac{\log_2 P_f}{N} \tag{10}$$

so that if we assume that $D/N \to 0$ as $N$ increases and if $L$ is
independent of $N$ ($R_2 \to 0$), then

$$E_f(R) = \frac{E_x(0) E_L(R_1)}{E_x(0) + E_L(R_1)} \tag{11}$$

where

$$R = \frac{R_1 E_x(0)}{E_x(0) + E_L(R_1)}. \tag{12}$$

Hence, $E_f(R)$ and $R$ are parametrized by the rate $R_1$.

Let us now consider the list size $L$ and show that it should be made
independent of $N$. The list size appears in (11) and (12) through the
list decoding exponent $E_L(R_1)$. As mentioned above, this exponent is
parametrized with a parameter $\rho$, $0 \le \rho \le L$. Associated with $E_L(R_1)$
is a rate $R_L^*$ such that if $R_1 < R_L^*$ then $E_L(R_1)$ is a straight line of
slope $-L$. Also, if $R_1 > R_L^*$ then $E_L(R_1)$ has slope $-\rho$, $\rho < L$. Thus,
$E_L(R_1)$ is largest for fixed $R_1$ if $L$ is such that $R_1 > R_L^*$. For any $R_1$,
the $L$ satisfying this inequality is fixed and finite. Now, examination of
(11) and (12) will show that $E_f(R)$ vs $R$ is largest when $E_L(R_1)$ is
largest, so that $E_f(R)$ is maximized over $L$ with a value of $L$ which is
independent of $N$. We choose to use that $L$ for which $R_1 > R_L^*$ so that
the arguments made in the previous paragraph hold.
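The dependence of the list decoding exponent on the list size can be explored numerically. The sketch below assumes a BSC with equiprobable inputs and uses Gallager's function $E_0(\rho) = \rho - (1+\rho)\log_2[p^{1/(1+\rho)} + (1-p)^{1/(1+\rho)}]$ for that channel, which is a standard result and is not written out in this paper. The maximization of $E_0(\rho) - \rho R_1$ over $0 \le \rho \le L$ is done by a simple grid search, and the function names are hypothetical.

```python
import math

def e0_bsc(rho, p):
    """Gallager's E_0(rho) in bits for a BSC with equiprobable inputs."""
    s = 1.0 / (1.0 + rho)
    return rho - (1.0 + rho) * math.log2(p ** s + (1.0 - p) ** s)

def list_exponent_bsc(R1, L, p, steps_per_unit=1000):
    """Grid-search estimate of max over 0 <= rho <= L of E_0(rho) - rho*R1."""
    best = 0.0
    for k in range(L * steps_per_unit + 1):
        rho = k / steps_per_unit
        best = max(best, e0_bsc(rho, p) - rho * R1)
    return best

p = 0.01  # assumed crossover probability, for illustration only
for R1 in (0.1, 0.7):
    values = [round(list_exponent_bsc(R1, L, p), 4) for L in (1, 2, 4)]
    print("R1 =", R1, "exponent for L = 1, 2, 4:", values)
```

At a low first-step rate the exponent keeps growing as $L$ is increased, while at a rate where the maximizing $\rho$ is already below 1 a larger list changes nothing; this is the saturation that lets $L$ be fixed independently of $N$.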

It can be shown 7 that the exponent $E_L(R_1)$ is equal to the
sphere-packing bound $E_{sp}(R_1)$ for $R_1 \ge R_L^*$. Thus, we have, as our final result,
the achievable (lower) bound, $E_f(R)$, to the largest obtainable exponent
with feedback, $E_F(R)$, given below:

$$E_F(R) \ge E_f(R) = \frac{E_x(0) E_{sp}(R_1)}{E_x(0) + E_{sp}(R_1)} \tag{13}$$

where

$$R = \frac{R_1 E_x(0)}{E_x(0) + E_{sp}(R_1)}. \tag{14}$$

A simple construction for $E_f(R)$ from $E_x(0)$ and the sphere-packing
bound is shown in Fig. 1.
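To make (13) and (14) concrete, the following sketch traces the curve $(R, E_f(R))$ for a BSC. It assumes the standard BSC expressions $E_{sp}(R_1) = D(\delta \,\|\, p)$ with $R_1 = 1 - H(\delta)$, $p < \delta < 1/2$, and $E_x(0) = -\tfrac{1}{4}\log_2[4p(1-p)]$, neither of which is written out in this paper. The names are hypothetical, the crossover probability is an arbitrary choice, and no attempt is made to reproduce the computer study described in the text.

```python
import math

def h2(x):
    """Binary entropy in bits."""
    return -x * math.log2(x) - (1 - x) * math.log2(1 - x)

def kl2(d, p):
    """Binary divergence D(d || p) in bits, used here as the BSC E_sp."""
    return d * math.log2(d / p) + (1 - d) * math.log2((1 - d) / (1 - p))

def feedback_curve_bsc(p, num_points=50):
    """Points (R, E_f) from (13) and (14), parametrized through delta."""
    ex0 = -0.25 * math.log2(4 * p * (1 - p))   # zero-rate expurgated exponent
    points = []
    for k in range(1, num_points):
        delta = p + (0.5 - p) * k / num_points  # p < delta < 1/2
        r1 = 1 - h2(delta)                      # first-step rate R_1
        esp = kl2(delta, p)                     # E_sp(R_1)
        denom = ex0 + esp
        points.append((r1 * ex0 / denom, ex0 * esp / denom))
    return points

for R, Ef in feedback_curve_bsc(0.01, num_points=5):
    print(f"R = {R:.4f}  E_f = {Ef:.4f}")
```

Each point presumes a list size $L$ large enough that $R_1 \ge R_L^*$, so that the list decoding exponent coincides with the sphere-packing exponent as stated above.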

IV. SOME EXAMPLES 

The exponent $E_f(R)$ on the BSC is significantly smaller than the
exponent implied by Berlekamp's results and for this reason is not shown.
A computer study of $E_f(R)$, however, shows that $E_f(R)$ is larger than
the random-code bounds with and without expurgation over a middle
range of rates when the crossover probability $p \le 10^{-7}$. It is found that
this range of rates increases with decreasing $p$.

Fig. 1. Construction for $E_f(R)$ on AGC, S/N = 256.

The exponent $E_f(R)$ is shown in Figs. 2 and 3 for the time-discrete,
average-power-limited, additive Gaussian noise channel with power
signal-to-noise ratios of 64 (18.1 dB) and 256 (24.1 dB), respectively.
In the first case, an improvement is seen over both the expurgated and
unexpurgated random-code exponents for a very large range of rates,
namely, $0.15 \le R \le 1.6$, in nats, and in the second case $E_f(R)$ is
larger than the Wyner upper bound 11 to the exponent without feedback,
$E_W(R)$, for a substantial range of rates, namely $0.90 \le R \le 1.80$.
Computer calculations have shown that if S/N < 160 (22 dB), then
$E_f(R) < E_W(R)$ and if S/N < 14 (11.5 dB), then $E_f(R) \le E_{rc}(R)$,
which is the random-code exponent without feedback.
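The decibel figures quoted above follow from $10 \log_{10}(\mathrm{S/N})$. The short check below reproduces them and, as an added assumption not stated in the text, compares the quoted rate ranges with the familiar capacity $\tfrac{1}{2}\ln(1 + \mathrm{S/N})$ nats per channel use of the time-discrete Gaussian channel. The function names are hypothetical.

```python
import math

def to_db(snr):
    """Convert a power signal-to-noise ratio to decibels."""
    return 10 * math.log10(snr)

def agc_capacity_nats(snr):
    """Capacity in nats per channel use of the time-discrete Gaussian channel."""
    return 0.5 * math.log(1 + snr)

for snr in (14, 64, 160, 256):
    print(snr, round(to_db(snr), 1), "dB, capacity",
          round(agc_capacity_nats(snr), 3), "nats")
```

Both quoted rate ranges end below the corresponding capacity, consistent with the improvement being confined to a middle range of rates.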
Fig. 2. Exponents on AGC, S/N = 64.



V. CONCLUSIONS 

We have introduced and examined a conceptually simple block-coding, 
feedback strategy. Using this strategy as an example of block-coding 
feedback strategies, we have found the first conclusive evidence of an 
improvement in the reliability of block coding with feedback at non- 
zero information rates on the additive Gaussian noise channel. This 
improvement is measured with the exponent on the probability of error 
and we have shown that an exponent larger than the random-code 
(lower bound) exponent can be obtained on many channels such as the 
relatively clean BSC and AGC and that on at least one channel, the 
AGC with S/N $\ge$ 22 dB, an exponent which is superior to the best
no-feedback exponent can be achieved. These results have been shown to
hold when there is a nonzero channel delay as long as that delay does 
not grow as fast as linearly with block length. 

The gain in reliability seen above can be translated into reduced
equipment costs or increased rate. We have made our comparison of
coding with and without feedback on the basis of fixed rate and block
length.

Fig. 3. Exponents on AGC, S/N = 256.

It is important to emphasize that the improvement in the perform- 
ance of block coding using feedback is due entirely to the fact that the 
codewords in the code are allowed to change with the channel noise. 
In our example, the list formed after the first step changes with the level 
of the channel noise so that the association of codewords in the second 
code with messages on the list becomes channel dependent. 



VI. ACKNOWLEDGMENT 



The author gratefully acknowledges many helpful conversations
with P. M. Ebert during this work. He also acknowledges Ebert's help in
the calculation of the Gaussian channel exponents.



THE BELL SYSTEM TECHNICAL JOURNAL, JULY-AUGUST 1966

APPENDIX

Equation (1) states the rule for choosing messages $m_1, m_2, \ldots, m_L$
to be placed on the list when list decoding of list size $L$ is used and the
channel sequence $v_1$ is received. An error is made if the transmitted
message, $m$ say, is not on this list. Using an argument similar to that
given in Ref. 10, (2) through (6), the probability of error with list
decoding when message $m$ is transmitted can be stated formally for a
particular code as

$$P_{em} = \sum_{v_1 \in V_{N_1}} P(v_1 \mid x_m)\, \phi_m(v_1).$$

Here $V_{N_1}$ is the set of channel sequences $\{v_1\}$ of length $N_1$, $x_m$ is the
$m$th code word in the code, and $\phi_m(v_1)$ is a characteristic function which
is one if $v_1$ results in a list which does not include $m$ and is zero
otherwise. Gallager, in unpublished work, has shown that the characteristic
function $\phi_m(v_1)$ may be overbounded by the following

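$$\phi_m(v_1) \le \left[ \sum_{m_1 \ne m} \cdots \sum_{m_L \ne m} \frac{P(v_1 \mid x_{m_1})^{1/(1+\rho)} \cdots P(v_1 \mid x_{m_L})^{1/(1+\rho)}}{P(v_1 \mid x_m)^{L/(1+\rho)}} \right]^{\rho/L}$$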

where $\rho \ge 0$. Carrying out the random-code arguments as given in
Ref. 10 with $0 \le \rho \le L$, we have the desired result.